We discuss a robust data analysis method to detect a stochastic background of gravitational waves in the presence of non-Gaussian noise. In contrast to the standard cross-correlation (SCC) statistic frequently used in stochastic background searches, we consider a generalized cross-correlation (GCC) statistic, which is nearly optimal even in the presence of non-Gaussian noise. The detection efficiency of the GCC statistic is investigated analytically, focusing in particular on the statistical relation between the false-alarm and the false-dismissal probabilities, and on the minimum detectable amplitude of gravitational-wave signals. We derive simple analytic formulas for these statistical quantities. The robustness of the GCC statistic is clarified based on these formulas, and one finds that the detection efficiency of the GCC statistic roughly corresponds to that of the SCC statistic when the contribution of the non-Gaussian tails is neglected. This remarkable property is checked with Monte Carlo simulations, and good agreement between the analytic and simulation results is found.
A stochastic background of gravitational waves is expected to be very weak compared with other types of gravitational-wave signals. Such a tiny signal is produced by an incoherent superposition of many gravitational-wave signals coming from unresolvable astrophysical objects and/or diffuse high-energy sources in the early universe. Up to now, various mechanisms to produce stochastic signals have been proposed, and their amplitudes and spectra have been estimated quantitatively (for reviews, see Refs. [1, 2]).

Despite the small amplitude of the signals, stochastic backgrounds of gravitational waves contain valuable cosmological information about the cosmic expansion history and astrophysical phenomena. Because gravitational waves interact so weakly, the extremely early stage of the universe beyond the last scattering surface of electromagnetic waves could be probed via the direct detection of an inflationary gravitational-wave background. In this sense, gravitational-wave backgrounds are an ultimate cosmological tool, and the direct detection of such signals will open a new field of cosmology.

As a trade-off, detection of a stochastic background is a very difficult and challenging problem. Recently, the observational bound on the stochastic background has been updated by the third scientific run of the Laser Interferometer Gravitational Wave Observatory (LIGO I) [4], and the amplitude of the signal is constrained to $\Omega_{\rm gw} < 8.4 \times 10^{-4}$, where $\Omega_{\rm gw}$ is the energy density of gravitational waves divided by the critical energy density. While this is the most stringent constraint obtained from laser interferometers, the bound is still larger than the limit inferred from big-bang nucleosynthesis. Hence, for the direct detection of stochastic signals, further improvement in sensitivity is essential. One obvious approach is to construct a more sophisticated detector whose sensitivity is limited only by quantum noise. Next-generation ground-based detectors, such as LIGO II and the Large-scale Cryogenic Gravitational-wave Telescope (LCGT), will greatly improve the sensitivity, which may reach or even beat the standard quantum limit. Furthermore, space-based interferometers will be suited to probing gravitational-wave backgrounds owing to their lower observational band. Another important direction is to explore efficient and robust data analysis techniques for signal detection.
In this paper, we treat the latter issue, focusing in particular on signal detection in the presence of non-Gaussian noise. When we search for weak stochastic signals embedded in the detector noise, we have no practical way to discriminate between the detector noise and a stochastic signal by using only a single detector. To detect a stochastic signal, we must combine the outputs of two different detectors and quantify the statistical correlation between them. This cross-correlation technique is a robust statistical method that remains useful even when the detector noise is large. The so-called standard cross-correlation technique has been frequently used in the data analysis of laser interferometers. Note that the standard cross-correlation statistic was derived under the assumption that both the signals and the instrumental noises obey stationary Gaussian processes [1]. In practice, however, gravitational-wave detectors do not have purely Gaussian noise. Because of some uncontrolled mechanisms, most experiments exhibit a non-Gaussian tail. In the presence of non-Gaussianity, the direct application of the standard cross-correlation statistic significantly degrades the sensitivity of signal detection. A cross-correlation statistic that reduces the influence of the non-Gaussian tails is therefore desirable for the data analysis of signal detection.

In Refs. [11, 13], the standard cross-correlation analysis was extended to deal with more realistic situations. It was found that the modified statistic performs better than the standard cross-correlation statistic [11]. This modified statistic is called the locally optimal statistic. Roughly speaking, the standard cross-correlation statistic uses all detector samples, while the locally optimal statistic excludes the samples that fall in the non-Gaussian tails outside the main Gaussian part. As a result, the statistical noise variance in the locally optimal statistic becomes small owing to the truncation of the tail samples, so that the effective signal-to-noise ratio becomes large.

In this paper, we derive analytical formulas for the false-alarm and false-dismissal probabilities and for the minimum detectable signal amplitude in order to quantify the performance of the locally optimal statistic. We then demonstrate the detection efficiency of the locally optimal statistic in a simple non-Gaussian noise model, in which the probability distribution of the instrumental noise is described by a two-component Gaussian distribution. Based on the analytical formulas, the efficiency of the locally optimal statistic is quantified and compared with that of the standard cross-correlation statistic.

The structure of this paper is as follows. In the next section, we briefly review the detection strategy for a stochastic background. We then introduce the generalized cross-correlation statistic, which is nearly optimal in the presence of non-Gaussian noise. In Sec. III, focusing in particular on the two-component Gaussian model as a simple model of non-Gaussian noise, we analytically estimate the false-alarm and false-dismissal probabilities. Based on this, we obtain an analytic expression for the minimum detectable amplitude of stochastic signals. The resultant analytic formulas imply that the detection efficiency of the GCC statistic roughly corresponds to that of the SCC statistic when the contribution of the non-Gaussian tails is neglected. These remarkable properties are checked and confirmed with Monte Carlo simulations in Sec. IV. Finally, in Sec. V, we close the paper with a summary of results and a discussion of future prospects.

II. OPTIMAL DETECTION STATISTIC IN THE PRESENCE OF NON-GAUSSIAN NOISE 

As mentioned previously, the gravitational-wave background (GWB) signal is expected to be very weak and is usually masked by the detector noise. It is practically impossible to detect such a tiny signal from a single-detector measurement. Thus, we cross-correlate the outputs obtained from two different detectors and seek a common signal. We denote the detector outputs by $s_i^k$, with

$$ s_i^k = h_i^k + n_i^k, \qquad (i = 1, 2;\ k = 1, \cdots, N), \tag{1} $$

where $i = 1, 2$ labels the two detectors, and $k = 1, \cdots, N$ is a time index. Here, $h_i^k$ is the gravitational-wave signal, whose amplitude is typically $\epsilon$, and $n_i^k$ is the noise in each detector. The $N \times 2$ output matrix $\mathcal{S}$ is made up of these outputs. Throughout this paper, we discuss the optimal detection method under the assumption of a weak signal, i.e., $|h_i^k| \sim \epsilon \ll |n_i^k|$.

A. Detection statistic 

To judge whether a gravitational-wave signal is indeed present in the detector outputs, the simplest approach is to use a detection statistic $\Lambda = \Lambda(\mathcal{S})$. When $\Lambda$ exceeds a threshold $\Lambda^*$, we conclude that the signal is detected, and that it is not detected otherwise. The statistic $\Lambda$, which is made up of the random variables $\mathcal{S}$, fluctuates randomly under finite sampling, and because of this we have two types of error that depend on the detection threshold $\Lambda^*$. The probabilities of these errors are often called the false-alarm rate and the false-dismissal rate. The false-alarm probability, which we denote by $P_{\rm FA}[\Lambda^*]$, is the probability that we conclude we have detected a signal when the signal is in fact absent. The false-dismissal probability, which we denote by $P_{\rm FD}[\Lambda^*]$, is the probability that we fail to detect a signal even though the signal is in fact present. Thus, one may say that a detection statistic is optimal only when the two errors are minimized. Neyman and Pearson showed that the likelihood ratio is the optimal decision statistic that minimizes $P_{\rm FD}$ for a given value of $P_{\rm FA}$ [3]. The likelihood ratio is given by

$$ \Lambda = \frac{p(\mathcal{S}|\epsilon)}{p(\mathcal{S}|0)}. \tag{2} $$

Here, the quantity $p(\mathcal{S}|\epsilon)$ is the probability distribution function of the observational data set $\mathcal{S}$ in the presence of a signal whose amplitude is given by $\epsilon$. We are specifically concerned with the detection of weak signals. In such a situation, regarding $\epsilon$ as a small parameter, one can expand $\Lambda$ as

$$ \Lambda = 1 + \epsilon \Lambda_1 + \epsilon^2 \Lambda_2 + \mathcal{O}(\epsilon^3). \tag{3} $$

As long as $\epsilon$ is small, the higher-order terms of $\mathcal{O}(\epsilon^2)$ are neglected and the quantity $\Lambda_1$ approximately becomes the optimal decision statistic. This statistic is called the locally optimal statistic. If $\Lambda_1$ vanishes, then $\Lambda_2$ is the optimal decision statistic.



B. Standard and generalized cross-correlation statistics 



In order to gain some insight into the locally optimal statistic, we consider the simplest situation for the data analysis of signal detection. We assume that the two detectors are coincident and coaligned, without any systematic noise correlation between them, so that both detectors receive the same signal, i.e., $h_1^k = h_2^k = h^k$. Several projects realize such a situation. The ongoing LIGO project has two colocated detectors at the Hanford site, although their arm lengths differ. The LCGT detector proposed by the Japanese group also has two colocated detectors sharing a common arm cavity.

In addition to the orientation of the detectors, we further assume that each detector has white and stationary noise. In this case, the joint probability distribution of the detector noises is given by

$$ p_n(\mathcal{N}) = \prod_{k=1}^{N} e^{-f_1(s_1^k - h^k) - f_2(s_2^k - h^k)}, \tag{4} $$

where the symbol $\mathcal{N}$ represents the noise contribution to the output matrix $\mathcal{S}$, i.e., $n_i^k = s_i^k - h^k$. Note that Eq. (4) reduces to a multivariate Gaussian distribution if the function $f_i$ is quadratic in its argument. Thus, a function $f_i$ other than the quadratic form implies non-Gaussianity of the detector noise. As for the probability of the signal amplitude, we also assume that the signal is white, so that the probability distribution function for $\mathcal{H} = \{h^1, \cdots, h^N\}$ is expressed by

$$ p_h(\mathcal{H}) = \prod_{k=1}^{N} p_{h^k}(h^k). \tag{5} $$

From Eqs. (1), (4) and (5), the numerator of the likelihood ratio (2) is given by

$$ p(\mathcal{S}|\epsilon) = \int dh^1 \cdots \int dh^N \, p_h(\mathcal{H}) \, p_n(\mathcal{N}). \tag{6} $$



Expanding the likelihood ratio with respect to $|h^k| \sim \epsilon \ll 1$ around zero, we obtain the locally optimal statistic [11, 12]. In the present case, $\Lambda_1$ in Eq. (3), which includes the linear term of the signal, vanishes because the stochastic gravitational wave is usually a zero-mean signal. Therefore, $\Lambda_2$ turns out to be the optimal decision statistic. $\Lambda_2$ is composed of second-derivative terms and quadratic combinations of first-derivative terms with respect to $s_i^k$. We then classify these terms into single-detector statistics and a two-detector statistic [11]. The former, which are described by quantities such as $f_i''$ and $(f_i')^2$, are only relevant when the gravitational-wave signal dominates the detector noise. The latter two-detector statistic is given by

$$ \Lambda_{\rm GCC} \propto \frac{1}{N} \sum_{k=1}^{N} f_1'(s_1^k) \, f_2'(s_2^k), \tag{7} $$

where we used the fact that the signal is white. In this paper, we call it the generalized cross-correlation (GCC) statistic$^1$.

In what follows, for the purpose of our analytic study, we treat the non-Gaussian parameters in the function $f_i$ as known parameters. Furthermore, we define the counterpart of the GCC statistic in the absence of non-Gaussianity as

$$ \Lambda_{\rm SCC} = \frac{1}{N} \sum_{k=1}^{N} s_1^k s_2^k, \tag{8} $$

which we call the standard cross-correlation (SCC) statistic. Strictly speaking, the decision statistics derived here are not the optimal decision statistics [10, 15]. For instance, in the case of Gaussian noise with unknown variances, the optimal decision statistic differs from Eq. (8) by a factor $(\sigma_1 \sigma_2)^{-1}$, where $\sigma_i$ is the square root of the autocorrelation function of the output signals $s_i^k$, i.e., $\sigma_i^2 = (1/N) \sum_k (s_i^k)^2$. Nevertheless, in the large-sample limit $N \to \infty$, statistical fluctuations in the autocorrelation functions become negligible relative to those in $\Lambda_{\rm GCC}$, and the autocorrelation functions can be treated as constants. Thus, in the limit $N \to \infty$, the factor $(\sigma_1 \sigma_2)^{-1}$ is irrelevant and one can identify Eq. (8) as the optimal decision statistic [15]. In this sense, Eq. (8) may be regarded as a nearly optimal statistic. Although it seems difficult to prove that the statistic $\Lambda_{\rm GCC}$ really approaches the (locally) optimal statistic in the large-sample limit with non-Gaussian noise, the essential properties of the statistic $\Lambda_{\rm GCC}$ are the same as those of the optimal decision statistic derived from the Bayesian treatment. We hope that the resultant analytic formulas for the detection efficiency are also useful in the practical situation in which we do not know the noise parameters a priori.



III. ANALYTIC ESTIMATION OF THE DETECTION EFFICIENCY 



We wish to clarify analytically how the GCC statistic improves the detection efficiency in the presence of non-Gaussian noise. For this purpose, we treat a simple non-Gaussian model, in which the probability distribution of the detector noise is characterized by the two-component Gaussian distribution [11, 16]:

$$ p_{n,i}(x) = e^{-f_i(x)} = \frac{1 - P_i}{\sqrt{2\pi}\,\sigma_{m,i}} \, e^{-x^2/2\sigma_{m,i}^2} + \frac{P_i}{\sqrt{2\pi}\,\sigma_{t,i}} \, e^{-x^2/2\sigma_{t,i}^2}, \qquad (i = 1, 2). \tag{9} $$

The left panel of Fig. 1 illustrates the probability distribution function (9). This model is characterized by two parameters, i.e., the ratio of variances, $(\sigma_{t,i}/\sigma_{m,i})^2$, and the fraction of the non-Gaussian tail, $P_i$. Here, $P_i$ is the total probability of the non-Gaussian tail. Of particular interest is the case $\sigma_{t,i}/\sigma_{m,i} > 1$ and $P_i \ll 1$. In this case, the detector noise is approximately described by a Gaussian distribution with the main variance $\sigma_{m,i}^2$, but to some extent it exhibits a non-Gaussian tail characterized by the second Gaussian component with a large variance $\sigma_{t,i}^2$. Examples of this situation are illustrated in the right panel of Fig. 1.
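To make the two-component model concrete, the following minimal Python sketch evaluates the mixture density and draws noise samples from it. The function names, the seed, and the parameter values ($P = 0.1$, $\sigma_m = 1$, $\sigma_t = 4$) are illustrative assumptions, not part of the original analysis; the sampling rule (pick the tail component with probability $P$, otherwise the main component) is one standard way to realize such a mixture.

```python
import math
import random


def two_component_pdf(x, tail_fraction, sigma_main, sigma_tail):
    """Two-component Gaussian noise density: a narrow core plus a wide tail."""
    def gauss(value, sigma):
        return math.exp(-0.5 * (value / sigma) ** 2) / (math.sqrt(2.0 * math.pi) * sigma)

    return ((1.0 - tail_fraction) * gauss(x, sigma_main)
            + tail_fraction * gauss(x, sigma_tail))


def sample_noise(n, tail_fraction, sigma_main, sigma_tail, rng):
    """Draw n samples; each one comes from the tail with probability tail_fraction."""
    return [
        rng.gauss(0.0, sigma_tail if rng.random() < tail_fraction else sigma_main)
        for _ in range(n)
    ]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, chosen only for reproducibility
    noise = sample_noise(100_000, 0.1, 1.0, 4.0, rng)
    sample_var = sum(v * v for v in noise) / len(noise)
    # Total variance of the mixture: (1 - P) * sigma_m^2 + P * sigma_t^2
    print(sample_var, 0.9 * 1.0 ** 2 + 0.1 * 4.0 ** 2)
```

The total variance printed for comparison, $(1 - P)\sigma_m^2 + P\sigma_t^2$, shows how even a small tail fraction can dominate the noise power seen by a statistic that uses all samples.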

On the other hand, we assume that the probability distribution of the stochastic signal is simply described by a Gaussian distribution with zero mean and a small variance $\epsilon^2$:

$$ p_{h^k}(h^k) = \frac{1}{\sqrt{2\pi}\,\epsilon} \, e^{-(h^k)^2/2\epsilon^2}. \tag{10} $$

Using these probability distribution functions, we derive analytical formulas for the detection efficiency of the GCC statistic, i.e., the $P_{\rm FA}$-$P_{\rm FD}$ curve and the minimum detectable amplitude of the gravitational-wave background signal.



A. Pfa versus Pfd curve 



In order to quantify the detection efficiency of the GCC statistic and compare it with that of the SCC statistic, it is convenient to compute the $P_{\rm FA}$-$P_{\rm FD}$ curve. For any detection statistic $\Lambda$, the false-alarm and false-dismissal probabilities, $P_{\rm FA}$ and $P_{\rm FD}$, are expressed as

$$ P_{\rm FA}[\Lambda^*] = \int_{\Lambda^*}^{\infty} dx \, p^{(0)}(x), \qquad P_{\rm FD}[\Lambda^*] = 1 - \int_{\Lambda^*}^{\infty} dx \, p^{(1)}(x). \tag{11} $$

Here, $p^{(0)}(x)$ and $p^{(1)}(x)$ are the probability distributions of the decision statistic in the absence and the presence of the signal, respectively. Thus, the $P_{\rm FA}$-$P_{\rm FD}$ curve is simply obtained from Eq. (11) as a parametric function of the threshold $\Lambda^*$. According to the Neyman-Pearson criterion, the best strategy to detect the stochastic signal is to choose the optimal statistic that minimizes $P_{\rm FD}$ for a given value of $P_{\rm FA}$. In other words, if the $P_{\rm FD}$ of the GCC statistic as a function of $P_{\rm FA}$ is always smaller than that of the SCC statistic, the GCC statistic is said to be closer to optimal than the SCC statistic.

$^1$ Although we extracted the cross-correlation term by hand, the Bayesian derivation automatically eliminates the self-correlation terms

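In the large-sample limit ($N \gg 1$), the central-limit theorem is applicable and the probability distributions $p^{(0)}$ and $p^{(1)}$ can be treated as Gaussian functions. We then have

$$ p^{(T)}(x) = \frac{1}{\sqrt{2\pi}\,\Delta\Lambda^{(T)}} \exp\left[ -\frac{(x - \langle\Lambda^{(T)}\rangle)^2}{2\,(\Delta\Lambda^{(T)})^2} \right], \qquad (T = 0, 1). \tag{12} $$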



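Here and in what follows, $\langle\Lambda^{(0)}\rangle$ and $[\Delta\Lambda^{(0)}]^2$ denote the mean and the variance of a decision statistic in the absence of a signal, while $\langle\Lambda^{(1)}\rangle$ and $[\Delta\Lambda^{(1)}]^2$ are the mean and the variance of a decision statistic with a signal. From Eqs. (11) and (12), the $P_{\rm FA}$-$P_{\rm FD}$ curve is given by

$$ P_{\rm FD} = 1 - \frac{1}{2}\,{\rm erfc}\left[ \left\{ \sqrt{2}\,{\rm erfc}^{-1}[2 P_{\rm FA}] - \frac{\langle\Lambda^{(1)}\rangle}{\Delta\Lambda^{(0)}} + \frac{\langle\Lambda^{(0)}\rangle}{\Delta\Lambda^{(0)}} \right\} \frac{1}{\sqrt{2}} \frac{\Delta\Lambda^{(0)}}{\Delta\Lambda^{(1)}} \right]. \tag{13} $$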



Here, erfc[x] is the complementary error function defined by

$$ {\rm erfc}[x] = \frac{2}{\sqrt{\pi}} \int_x^{\infty} dz \, e^{-z^2}. \tag{14} $$

Note that in the case of the SCC statistic, the quantity $\langle\Lambda^{(1)}\rangle/\Delta\Lambda^{(0)}$ coincides with the usual signal-to-noise ratio (SNR). In general, the false-dismissal probability $P_{\rm FD}$ is a decreasing function of the quantity $\langle\Lambda^{(1)}\rangle/\Delta\Lambda^{(0)}$ for a given false-alarm probability $P_{\rm FA}$.
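The relation between $P_{\rm FA}$ and $P_{\rm FD}$ under the Gaussian approximation can be sketched numerically as follows. This is an illustration rather than the authors' code: the function name and arguments are hypothetical, and the standard normal distribution from Python's `statistics` module is used in place of erfc and its inverse, which is equivalent because $\frac{1}{2}{\rm erfc}[z/\sqrt{2}]$ is the upper-tail probability of a standard normal variable.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def false_dismissal(p_fa, mean0, std0, mean1, std1):
    """P_FD at fixed P_FA when the statistic is Gaussian with and without a signal."""
    # Threshold whose upper-tail probability under the noise-only distribution is p_fa.
    threshold = mean0 + std0 * _STD_NORMAL.inv_cdf(1.0 - p_fa)
    # Probability that the statistic stays below the threshold when a signal is present.
    return _STD_NORMAL.cdf((threshold - mean1) / std1)


if __name__ == "__main__":
    # Unit variances and zero noise-only mean, so mean1 plays the role of the SNR.
    for snr in (0.0, 1.0, 2.0, 3.0):
        print(snr, false_dismissal(0.1, 0.0, 1.0, snr, 1.0))
```

With equal variances and zero noise-only mean, the printed values decrease as the mean with signal grows, which is the monotonic dependence on $\langle\Lambda^{(1)}\rangle/\Delta\Lambda^{(0)}$ noted above.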
FIG. 1: Left: Probability distribution function of the instrumental noise given by Eq. (9). The model parameters are set to $\sigma_m = 1$, $\sigma_t = 4$ and $P = 0.1$. Here we dropped the detector label $i$. Right: Time-series data of the non-Gaussian noise generated from Eq. (9). The top panel shows the result with tail fraction $P = 0.01$, while the bottom panel plots the case with a larger value, $P = 0.1$. In both panels, the model parameters are set to $\sigma_m = 1$ and $\sigma_t = 4$. For comparison, we also plot the weak signal of stochastic gravitational waves with $\epsilon = 0.1$.

FIG. 2: Derivative of the function $f(x)$ in the two-component Gaussian noise model. The left panel shows the dependence on the tail fraction $P_i$ with the ratio of noise variances fixed, i.e., $\sigma_{t,i}/\sigma_{m,i} = 4$. The right panel presents the dependence on the ratio $\sigma_{t,i}/\sigma_{m,i}$ with the tail fraction fixed, i.e., $P_i = 0.1$.



B. Mean and Variance for detection statistic 



Our task is to calculate the means and the variances of the detection statistic, i.e., $\langle\Lambda^{(T)}\rangle$ and $[\Delta\Lambda^{(T)}]^2$. In order to compare the performance of the GCC statistic with that of the SCC statistic, we first consider the means and the variances of the SCC statistic. From Eqs. (1) and (8)-(10), the ensemble averages become
Next, we calculate the means and the variances of the GCC statistic (7). For the non-Gaussian model (9) of the instrumental noise, the derivative $f_i'(x)$ in Eq. (7) is given by

$$ f_i'(x) = \frac{x}{\sigma_{m,i}^2} \, \frac{(1 - P_i) + P_i\,(\sigma_{m,i}/\sigma_{t,i})^3 \, e^{x^2(\sigma_{m,i}^{-2} - \sigma_{t,i}^{-2})/2}}{(1 - P_i) + P_i\,(\sigma_{m,i}/\sigma_{t,i}) \, e^{x^2(\sigma_{m,i}^{-2} - \sigma_{t,i}^{-2})/2}}. \tag{20} $$



The expression (20) seems rather intractable for further analytical calculation. However, in the situations we are interested in, i.e., $\sigma_{t,i}/\sigma_{m,i} > 1$ and $P_i \ll 1$, the above function simply behaves like $f_i'(x) \approx x/\sigma_{m,i}^2$ for small $|x|$ and $f_i'(x) \approx x/\sigma_{t,i}^2$ for large $|x|$. Thus, one may apply a two-step approximation to the function (20):

$$ f_i'(s_i^k) \simeq \begin{cases} s_i^k, & |s_i^k| < |x_{{\rm cr},i}|, \\ (\sigma_{m,i}/\sigma_{t,i})^2 \, s_i^k, & |s_i^k| > |x_{{\rm cr},i}|. \end{cases} \tag{21} $$

Here, the quantity $x_{{\rm cr},i}$ is the critical value that characterizes the boundary between small $|s_i^k|$ and large $|s_i^k|$. Note that we have adjusted the overall factor of the function $f_i'(s_i^k)$ so that the GCC statistic coincides with the SCC statistic (8) in the limit $|x_{{\rm cr},i}| \to \infty$.
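The limiting behavior of $f_i'(x)$ and the two-step approximation can be checked numerically with a short sketch. The code below is an illustration under stated assumptions: the function names are hypothetical, the exact derivative is rescaled by $\sigma_m^2$ to match the normalization of the two-step form, the exponential is rewritten with a negative exponent purely for numerical stability, and the critical value `x_cr` is passed in by the caller rather than computed.

```python
import math


def f_prime_exact(x, tail_fraction, sigma_main, sigma_tail):
    """sigma_main**2 times f'(x) for the two-component Gaussian noise model."""
    if tail_fraction == 0.0:
        return x  # purely Gaussian noise: the rescaled f'(x) is linear
    if sigma_tail < sigma_main:
        raise ValueError("this sketch assumes sigma_tail >= sigma_main")
    r = sigma_main / sigma_tail
    # w = exp(-x^2 (1/sigma_m^2 - 1/sigma_t^2) / 2) lies in [0, 1], so it cannot overflow.
    w = math.exp(-0.5 * x * x * (1.0 / sigma_main ** 2 - 1.0 / sigma_tail ** 2))
    core = (1.0 - tail_fraction) * w
    return x * (core + tail_fraction * r ** 3) / (core + tail_fraction * r)


def f_prime_two_step(x, x_cr, sigma_main, sigma_tail):
    """Two-step form: slope 1 for |x| < x_cr, slope (sigma_m/sigma_t)^2 otherwise."""
    return x if abs(x) < x_cr else (sigma_main / sigma_tail) ** 2 * x


if __name__ == "__main__":
    # Illustrative parameters; x_cr = 3.0 is an arbitrary choice for the comparison.
    for x in (0.1, 1.0, 3.0, 6.0, 20.0):
        print(x, f_prime_exact(x, 0.01, 1.0, 4.0), f_prime_two_step(x, 3.0, 1.0, 4.0))
```

Multiplying numerator and denominator of the exact expression by the decaying exponential leaves its value unchanged but avoids overflow for large $|x|$, where the result tends to $(\sigma_m/\sigma_t)^2 x$.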

Fig. 2 shows the dependence of the function $f'(x)$ on the model parameters $P_i$ (left) and $\sigma_{t,i}/\sigma_{m,i}$ (right). As the tail fraction decreases or the ratio $\sigma_{t,i}/\sigma_{m,i}$ increases, the asymptotic behavior of $f'(x)$ changes steeply from $x/\sigma_{m,i}^2$ to $x/\sigma_{t,i}^2$ around the inflection point of $f'(x)$. Hence, it seems reasonable to set the critical value $x_{{\rm cr},i}$ to the inflection point of $f'(x)$. Then, the quantity $x_{{\rm cr},i}$ is approximately expressed as




log 



P 



1 - Pi °t. 
(22)



Here, we only considered the solution satisfying the condition $(x_{{\rm cr},i}/\sigma_{m,i}) > 1$.

Adopting the critical value in Eq. (22), with the help of the two-step approximation, the means and the variances of the GCC statistic can be calculated analytically. The details of the calculation are presented in Appendix A. The resultant expressions become
Here, the quantity erf[x] is the error function. In deriving Eqs. (23)-(26), we have neglected contributions of the integral from the region $[x_{{\rm cr},i}, \infty]$. In Ref. [11], this treatment is called clipping. The explicit expressions of the higher-order terms in Eq. (25) are given in Appendix A. These terms turn out to be subdominant if the non-Gaussian parameters satisfy $P_i \lesssim 0.2$ or $\sigma_{t,i}/\sigma_{m,i} \gtrsim 3$. In what follows, we neglect the higher-order terms in Eq. (25) unless otherwise stated.

Now, we substitute the expressions (23)-(26) into Eq. (13). The analytic $P_{\rm FA}$-$P_{\rm FD}$ curve for the GCC statistic is written as
(30)



In the expression (29), we have introduced the auxiliary quantities $\rho$ and $\Delta$ to clarify the differences between the GCC and SCC statistics. Obviously, the ratios $\rho_{\rm eff}/\rho$ and $\Delta_{\rm eff}/\Delta$ become unity when the probability distribution of the noise is Gaussian, leading to the $P_{\rm FA}$-$P_{\rm FD}$ curve for the SCC statistic. Thus, the deviation of these quantities from unity characterizes the efficiency of the GCC statistic.

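Fig. 3 shows the ratio $\rho_{\rm eff}/\rho$ as a function of $\sigma_t/\sigma_m$ for various tail fractions $P$. To plot the curves, just for simplicity, we assume that the two detectors are identical:

$$ \sigma_m = \sigma_{m,1} = \sigma_{m,2}, \quad \sigma_t = \sigma_{t,1} = \sigma_{t,2}, \quad P = P_1 = P_2, \quad x_{\rm cr} = x_{{\rm cr},1} = x_{{\rm cr},2}. \tag{31} $$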



In Fig. 3, the ratio $\rho_{\rm eff}/\rho$ is always larger than unity for any values of $P$ and $\sigma_t/\sigma_m$. Recalling that the quantity $\rho$ has the usual meaning of the SNR, this result implies that the clipping used in the GCC statistic always leads to a larger effective SNR than that of the SCC statistic. On the other hand, when we evaluate the quantity $\Delta_{\rm eff}/\Delta$, we find that this ratio is always less than 1. These two facts indicate that the false-dismissal probability $P_{\rm FD}$ of the GCC statistic is always smaller than that of the SCC statistic. Note also that $\Delta_{\rm eff} \approx 1$ and $\Delta \approx 1$ as long as the signal $\epsilon$ is small. Thus, to a good approximation, we can set $\Delta_{\rm eff}$ to unity. Hence, the performance of the GCC statistic is mainly attributed to the ratio $\rho_{\rm eff}/\rho$.

Based on this consideration, in Fig. 4 we plot the analytic $P_{\rm FA}$-$P_{\rm FD}$ curves for various signal amplitudes. Here, the parameters $P$, $\sigma_t/\sigma_m$ and $N$ are chosen as $P = 0.01$, $\sigma_t/\sigma_m = 4$ and $N = 10^4$. The solid and dotted lines represent the $P_{\rm FA}$-$P_{\rm FD}$ curves for the GCC and SCC statistics, respectively. For each signal amplitude $\epsilon$, the false-dismissal probability $P_{\rm FD}$ of the GCC statistic is always smaller than that of the SCC statistic for any $P_{\rm FA}$. As expected, the performance of the GCC statistic improves as the parameter $\epsilon$ increases.
FIG. 4: Analytic $P_{\rm FA}$-$P_{\rm FD}$ curves for the standard and generalized cross-correlation statistics in the presence of the non-Gaussian noise described by the specific model (9). The solid (dashed) lines represent the $P_{\rm FA}$-$P_{\rm FD}$ curves for the GCC (SCC) statistic for stochastic signals with amplitude $\epsilon = 0.03$, $0.06$ and $0.12$ (top to bottom). Here, we assume that the two detectors are identical (see Eq. (31)). For each curve, the parameters are set to $P = 0.01$, $\sigma_t/\sigma_m = 4$ and $N = 10^4$.



C. Minimum detectable amplitude 



In addition to the $P_{\rm FA}$-$P_{\rm FD}$ curves, the minimum detectable amplitude of the stochastic signal, $\epsilon_{\rm detect}$, is a direct measure of detectability. In order to estimate it statistically, we must first specify the threshold values $(P_{\rm FA}, P_{\rm FD})$, called the detection point [15]. For given threshold values, the minimum detectable amplitude $\epsilon_{\rm detect}$ can be uniquely determined from Eq. (29). For simplicity, we set $P_{\rm FA} = P_{\rm FD}$. The resultant amplitude for the GCC statistic, $\epsilon_{\rm detect}^{\rm GCC}$, is
FIG. 5: The function $G$ plotted against the ratio $\sigma_t/\sigma_m$ in the case of two identical detectors (see Eq. (31)). Here, the tail fraction $P$ is chosen as $P = 0.01$, $0.05$, $0.1$ and $0.2$ from top to bottom. The thick and thin lines are the function $G$ defined in Eq. (34) and the one taking account of the higher-order terms (A9), respectively.



where we have assumed $\Delta_{\rm eff} = 1$. The quantity $\gamma$ is given by $\gamma = {\rm erfc}^{-1}[2 P_{\rm FA}]$, and the amplitude $\epsilon_{\rm detect}^{\rm SCC}$ is the minimum detectable amplitude for the SCC statistic in the large-$N$ limit [15]:
In Eq. (32), the important quantity is the function $G$, which characterizes the gain relative to the amplitude $\epsilon_{\rm detect}^{\rm SCC}$.
The function $G$ becomes unity when the noise probability functions reduce to Gaussian distributions. It also approaches unity if the ratio $\sigma_{t,i}/\sigma_{m,i}$ becomes unity. For the stochastic signal $\epsilon$, we have the relation $\Omega_{\rm gw} \propto \epsilon^2 \propto {\rm SNR}$. Thus, the minimum detectable $\Omega_{\rm gw}$ using the GCC statistic is improved by a factor of $G^2$ compared with that of the SCC statistic.
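As an illustration of how a detection point fixes the minimum detectable amplitude, the sketch below solves $P_{\rm FD} = P_{\rm FA}$ for the SCC statistic by bisection under the Gaussian approximation. It is not taken from the elided equations of this paper: the function names are hypothetical, and the SCC mean $\epsilon^2$ and variance $[2\epsilon^4 + \epsilon^2(\sigma_1^2 + \sigma_2^2) + \sigma_1^2\sigma_2^2]/N$ are elementary results assumed here for a Gaussian signal of variance $\epsilon^2$ and independent zero-mean noise with total variances $\sigma_1^2$ and $\sigma_2^2$.

```python
from statistics import NormalDist


def scc_moments(eps, sigma1, sigma2, n):
    """Mean and standard deviation of (1/N) sum s1*s2 for signal variance eps^2."""
    mean = eps ** 2
    var = (2.0 * eps ** 4 + eps ** 2 * (sigma1 ** 2 + sigma2 ** 2)
           + sigma1 ** 2 * sigma2 ** 2) / n
    return mean, var ** 0.5


def scc_false_dismissal(eps, p_fa, sigma1, sigma2, n):
    """Gaussian-approximation P_FD of the SCC statistic at fixed P_FA."""
    std_normal = NormalDist()
    _, std0 = scc_moments(0.0, sigma1, sigma2, n)
    mean1, std1 = scc_moments(eps, sigma1, sigma2, n)
    threshold = std0 * std_normal.inv_cdf(1.0 - p_fa)
    return std_normal.cdf((threshold - mean1) / std1)


def minimum_detectable_amplitude(p_fa, p_fd, sigma1, sigma2, n, tol=1e-8):
    """Bisect on eps until the Gaussian-approximation P_FD drops to p_fd."""
    lo, hi = 0.0, 1.0
    while scc_false_dismissal(hi, p_fa, sigma1, sigma2, n) > p_fd:
        hi *= 2.0  # widen the bracket until the signal is loud enough
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if scc_false_dismissal(mid, p_fa, sigma1, sigma2, n) > p_fd:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    gaussian = minimum_detectable_amplitude(0.1, 0.1, 1.0, 1.0, 10_000)
    # Total noise std for P = 0.1, sigma_m = 1, sigma_t = 4: sqrt(0.9 + 0.1 * 16)
    tailed = minimum_detectable_amplitude(0.1, 0.1, 2.5 ** 0.5, 2.5 ** 0.5, 10_000)
    print(gaussian, tailed)
```

In this sketch the non-Gaussian tail enters the SCC result only through the total noise variance, so a heavier tail directly raises the amplitude needed to reach the detection point.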

In Fig. 5, the thick lines show the quantity $G$ as a function of $\sigma_t/\sigma_m$ in the case of two identical detectors (see Eq. (31)). The thin lines represent the same plot, but with the higher-order terms (A9) in Appendix A taken into account. As the tail fraction becomes smaller and the ratio $\sigma_t/\sigma_m$ becomes larger, the thick lines approach the thin lines. The quantity $G$ monotonically decreases as the ratio $\sigma_t/\sigma_m$ or the tail fraction $P$ increases. Specifically, for the parameters $P = 0.1$ and $\sigma_t/\sigma_m = 10$, we obtain $G \sim 0.35$. Since the minimum detectable $\Omega_{\rm gw}$ scales as $G^2$, this implies that the sensitivity to the stochastic signal is improved by roughly a factor of 10 in terms of SNR ($1/G^2 \approx 8$), compared with the sensitivity achieved with the SCC statistic.

In the situation with $P_i \ll 1$ and $(\sigma_{m,i}/\sigma_{t,i}) \ll 1$, a more compact form of the approximation for $G^2$ is found:



G 



)g(«|)( 



n 



1 - p 



(1 - P) + P (o-t.i/tTn 



1/2 



(35) 



Thus, when the quantity $P_i\,(\sigma_{t,i}/\sigma_{m,i})^2$ is larger than unity, the GCC statistic can become more powerful than the SCC statistic.



IV. MONTE CARLO SIMULATION 



In this section, we perform Monte Carlo simulations of the cross-correlation analysis and compare the $P_{\rm FA}$-$P_{\rm FD}$ curves and the minimum detectable amplitude $\epsilon_{\rm detect}$ from the analytic estimates with those obtained from the numerical simulations. For the rest of this paper, we assume that the two detectors are identical and satisfy the condition (31).

A. Algorithm of Monte Carlo simulation

Our Monte Carlo algorithm basically follows Ref. [15]. We numerically calculate the false-alarm and false-dismissal probabilities $P_{\rm FA}$ and $P_{\rm FD}$ by taking an ensemble over $N_{\rm CHUNK}$ simulated experiments. For each experiment, we randomly generate two kinds of $(N \times 2)$ matrix $\mathcal{S}$ made up of the detector outputs: one contains stochastic signals and the other contains only the instrumental noise. We then compute the decision statistic in the presence and in the absence of the stochastic signals. Varying the threshold for the decision statistic, we obtain the $P_{\rm FA}$-$P_{\rm FD}$ curve. The details of the algorithm are summarized as follows (see also Ref. [15]):

• Generate two kinds of $(N \times 2)$ data matrix $\mathcal{S}$:

For a specific parameter set $(P, \sigma_m, \sigma_t, \epsilon, N)$, we first generate the $N$ data train that contains only the instrumental noise, i.e., $s_i^k = n_i^k$ $(i = 1, 2;\ k = 1, \cdots, N)$. These random data are created according to the probability distribution function (9). We then duplicate the data train and add the stochastic signals (Eq. (10)) to one of the copies, i.e., $s_i^k = h_i^k + n_i^k$ $(i = 1, 2;\ k = 1, \cdots, N)$.

• Compute the decision statistics $\Lambda_{\rm SCC}^{(T)}$ and $\Lambda_{\rm GCC}^{(T)}$ from the matrix $\mathcal{S}$ for $T = 0$ and $1$:

Based on the expressions (7) and (8), with prior knowledge of the noise parameters $(P, \sigma_m, \sigma_t)$, we compute the decision statistics $\Lambda_{\rm SCC}^{(T)}$ and $\Lambda_{\rm GCC}^{(T)}$ from the data matrix $\mathcal{S}$ in both the absence and the presence of the stochastic signals $(T = 0, 1)$. Note that the derivative $f_i'(x)$ in Eq. (7) is given by Eq. (20).

• Set a threshold value $\Lambda^*$ to determine a point $(P_{\rm FA}[\Lambda^*], P_{\rm FD}[\Lambda^*])$ for GCC and SCC:

For a given value $\Lambda^*$, we increase $P_{\rm FA}$ by $1/N_{\rm CHUNK}$ when the condition $\Lambda^{(0)} > \Lambda^*$ is satisfied. Also, we increase $P_{\rm FD}$ by $1/N_{\rm CHUNK}$ if the relation $\Lambda^{(1)} < \Lambda^*$ holds. These operations are performed for both the GCC and the SCC statistics while varying the threshold value $\Lambda^*$.

• Repeat the above steps $N_{\rm CHUNK}$ times to estimate the probabilities $(P_{\rm FA}[\Lambda^*], P_{\rm FD}[\Lambda^*])$ for various threshold values $\Lambda^*$.
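The steps above can be sketched in a few dozen lines of standard-library Python. This is a minimal illustration under stated assumptions, not the code used for the results in this paper: all function names are hypothetical, the two detectors are taken to be identical, the derivative $f'(x)$ is rescaled by $\sigma_m^2$ so that the GCC statistic reduces to the SCC statistic for Gaussian noise, and the same threshold grid is applied to both statistics for brevity. The sample sizes in the usage lines are deliberately small.

```python
import math
import random


def f_prime(x, p, sigma_m, sigma_t):
    """sigma_m^2 * f'(x) for the two-component model (assumes sigma_t >= sigma_m)."""
    if p == 0.0:
        return x
    r = sigma_m / sigma_t
    w = math.exp(-0.5 * x * x * (1.0 / sigma_m ** 2 - 1.0 / sigma_t ** 2))
    core = (1.0 - p) * w
    return x * (core + p * r ** 3) / (core + p * r)


def draw_noise(n, p, sigma_m, sigma_t, rng):
    """Noise samples: tail component with probability p, main component otherwise."""
    return [rng.gauss(0.0, sigma_t if rng.random() < p else sigma_m) for _ in range(n)]


def one_experiment(p, sigma_m, sigma_t, eps, n, rng):
    """Return {T: (scc, gcc)} for T = 0 (noise only) and T = 1 (signal added)."""
    n1 = draw_noise(n, p, sigma_m, sigma_t, rng)
    n2 = draw_noise(n, p, sigma_m, sigma_t, rng)
    h = [rng.gauss(0.0, eps) for _ in range(n)]  # common signal in both detectors
    stats = {}
    for t in (0, 1):
        s1 = [a + t * c for a, c in zip(n1, h)]
        s2 = [b + t * c for b, c in zip(n2, h)]
        scc = sum(a * b for a, b in zip(s1, s2)) / n
        gcc = sum(f_prime(a, p, sigma_m, sigma_t) * f_prime(b, p, sigma_m, sigma_t)
                  for a, b in zip(s1, s2)) / n
        stats[t] = (scc, gcc)
    return stats


def estimate_errors(p, sigma_m, sigma_t, eps, n, n_chunk, thresholds, seed=0):
    """Estimate (P_FA, P_FD) at each threshold for the 'scc' and 'gcc' statistics."""
    rng = random.Random(seed)
    runs = [one_experiment(p, sigma_m, sigma_t, eps, n, rng) for _ in range(n_chunk)]
    result = {}
    for idx, name in enumerate(("scc", "gcc")):
        points = []
        for thr in thresholds:
            p_fa = sum(run[0][idx] > thr for run in runs) / n_chunk
            p_fd = sum(run[1][idx] < thr for run in runs) / n_chunk
            points.append((p_fa, p_fd))
        result[name] = points
    return result


if __name__ == "__main__":
    # Small illustrative run; these sizes are far below those used in the paper.
    curves = estimate_errors(0.1, 1.0, 4.0, 0.5, 500, 100, [0.0, 0.05, 0.125, 0.25])
    for name, points in curves.items():
        print(name, points)
```

Reusing the same noise realization for the $T = 0$ and $T = 1$ data sets mirrors the duplication step in the first bullet; independent realizations would serve equally well for estimating the two probabilities.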

In the simulations presented below, the numbers of samples and trials are set to N = 10 4 and N c 

HUNK = 5 X 10 , 

respectively. Note that $N = 10^4$ samples roughly correspond to the number of data points appropriate for a low-frequency detector like the Laser Interferometer Space Antenna (LISA) [17], for which a 1-year observation and an effective bandwidth of $10^{-3}$ Hz are assumed. Below, we present the results with the noise variance $\sigma_m^2 = 1$ kept fixed.

B. Simulation results and discussion 

Let us first show the $P_{\rm FA}$-$P_{\rm FD}$ curves. In Fig. [6], the symbols denote the simulated $P_{\rm FA}$-$P_{\rm FD}$ curves for the GCC (left) and SCC (right) statistics for several tail fractions $P$. Here, the signal amplitude $\epsilon = 0.1$ and the ratio of the square roots of the noise variances, $\sigma_t/\sigma_m = 4$, are kept fixed. Basically, the false-dismissal probability for a given $P_{\rm FA}$ becomes large as the tail fraction increases. However, for fixed $P$, the false-dismissal probabilities of the GCC statistic are always smaller than those of the SCC statistic. In the left panel of Fig. [6], the three thick lines indicate the analytic $P_{\rm FA}$-$P_{\rm FD}$ curves without the higher-order terms in Eq. (25), which quantitatively agree with the Monte Carlo simulations. A closer look at the results for the GCC statistic with tail fraction $P = 0.2$ shows a small discrepancy between the analytic and simulation results, which is mainly attributed to the higher-order terms neglected in the analytic results. The thin line in the left panel of Fig. [6] shows the same analytic $P_{\rm FA}$-$P_{\rm FD}$ curve with the higher-order terms (A9) taken into account, for which the agreement becomes excellent. Note that most gravitational-wave detectors have a fairly small non-Gaussian component, so the analytic formulas for $P \ll 1$ without the higher-order terms would be applicable in practice. 
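To make the shape of such a curve concrete, the following sketch evaluates the false-dismissal probability at a fixed false-alarm probability. It assumes, as an illustration only, that the statistic is Gaussian distributed with the same standard deviation with and without the signal; `mean_signal` and `sigma_stat` are hypothetical names standing for the mean and the standard deviation of the statistic, and the expression is not claimed to reproduce Eq. (25) term by term.

```python
from math import erfc, sqrt
from statistics import NormalDist


def erfc_inv(y):
    """Inverse complementary error function via the standard normal quantile."""
    return -NormalDist().inv_cdf(y / 2.0) / sqrt(2.0)


def p_fd_from_p_fa(p_fa, mean_signal, sigma_stat):
    """False-dismissal probability for a Gaussian-distributed statistic.

    The threshold is fixed by P_FA = erfc(threshold / (sqrt(2) sigma)) / 2,
    and P_FD is the probability that the shifted statistic stays below it.
    """
    gamma = erfc_inv(2.0 * p_fa)
    return 0.5 * erfc(mean_signal / (sqrt(2.0) * sigma_stat) - gamma)
```

With `mean_signal = 0` the function returns `1 - p_fa`, as it must when no signal is present; increasing the ratio `mean_signal / sigma_stat` lowers the false-dismissal probability at fixed false-alarm probability.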

Fig. [7] shows another plot of the $P_{\rm FA}$-$P_{\rm FD}$ curves. In each panel, the tail fraction is fixed to $P = 0.1$ and the dependence on the ratio $\sigma_t/\sigma_m$ is depicted; the analytic and the simulation results show similar trends. In this figure, the performance of the GCC statistic is remarkably good. Even for larger non-Gaussian tails, the $P_{\rm FA}$-$P_{\rm FD}$ curves for the GCC statistic remain almost unchanged. On the other hand, the SCC statistic degrades significantly as the ratio $\sigma_t/\sigma_m > 1$ increases. This is indeed anticipated from the behavior of the quantity in Eq. (29) and the figure associated with it. 

Turning to the minimum detectable amplitude, we plot in Fig. [8] the dependence of the amplitude $\epsilon_{\rm detect}$ on the tail fraction $P$ (left) and the ratio $\sigma_t/\sigma_m$ (right). In this plot, we specifically set the detection point to $(P_{\rm FA}, P_{\rm FD}) = (0.1, 0.1)$. Note that for the numerical evaluation of the amplitude $\epsilon_{\rm detect}$, we ran the Monte Carlo simulation several times, varying the amplitude $\epsilon$ to find the point satisfying the condition $(P_{\rm FA}, P_{\rm FD}) = (0.1, 0.1)$, until an accuracy of a few percent was achieved. In each panel, the solid and dotted lines represent the 
FIG. 6: $P_{\rm FA}$-$P_{\rm FD}$ curves for the GCC (left) and the SCC (right) statistics. Symbols denote the simulation results, while the lines indicate the analytic prediction from Eq. (29). In each panel, the ratio of the noise variance is fixed to $\sigma_t/\sigma_m = 4$ and the amplitude of the stochastic signal is set to $\epsilon = 0.1$. The tail fractions are $P = 0.0$ (filled circles and solid line), $P = 0.05$ (open circles and dotted line), and $P = 0.2$ (filled squares and dashed line). Note that for the tail fraction $P = 0.0$, corresponding to the Gaussian noise case, the solid line and the filled circles in the left panel are identical to those in the right panel. The thin dashed line for $P = 0.2$ indicates the analytic $P_{\rm FA}$-$P_{\rm FD}$ curve taking account of the higher-order terms (A9). 




FIG. 7: Same as in Fig. [6], but here we plot the dependence on the ratio $\sigma_t/\sigma_m$, fixing the tail fraction and the amplitude of the stochastic signal to $P = 0.1$ and $\epsilon = 0.1$: $\sigma_t/\sigma_m = 2$ (filled circles and solid line); $\sigma_t/\sigma_m = 4$ (open circles and dotted line); $\sigma_t/\sigma_m = 8$ (filled squares and short-dashed line); $\sigma_t/\sigma_m = 16$ (open squares and long-dashed line). 



analytic estimates of the minimum amplitude for the GCC and SCC statistics, respectively (Eqs. (32) and (33)). The thin line in the left panel shows the analytical prediction including the higher-order terms (A9). For smaller tail fractions, $P < 0.1$, the analytic results for the GCC statistic reasonably approximate the simulation results, and the resultant amplitude $\epsilon_{\rm detect}$ is insensitive to the non-Gaussian tails. On the other hand, the minimum amplitude of the SCC statistic increases linearly with the ratio $\sigma_t/\sigma_m$. The insensitivity of the GCC statistic is precisely what we expected from the analytic estimate of the minimum detectable amplitude (see Sec. III C and the corresponding figure). That is, the dependences of the function $G$ and of $\epsilon^{\rm SCC}_{\rm detect}$ on the ratio $\sigma_t/\sigma_m$ almost cancel each other, leading to the insensitivity of $\epsilon^{\rm GCC}_{\rm detect}$. Since the two-step approximation in our analytic formulas becomes a better description for larger values of $\sigma_t/\sigma_m$, as long as the tail fraction $P$ is small, the analytic estimate of $\epsilon^{\rm GCC}_{\rm detect}$ provides a robust and quantitative 
FIG. 8: Minimum detectable amplitude of the gravitational-wave signals as a function of the tail fraction $P$ (left) and the ratio of noise variance $\sigma_t/\sigma_m$ (right). The ratio of noise variance in the left panel is specifically chosen as $\sigma_t/\sigma_m = 4$, while the tail fraction in the right panel is set to $P = 0.1$. In both panels, filled (open) circles represent the simulation results derived from the GCC (SCC) statistic. The corresponding analytic curves, based on the expressions (32) and (33), are shown as solid and dotted lines. The thin line in the left panel is the analytical prediction including the higher-order terms (A9). Note that in these plots, the detection point is specifically set to $(P_{\rm FA}, P_{\rm FD}) = (0.1, 0.1)$ with $N = 10^4$ sample points. 



prediction for the detection efficiency of the GCC. 
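The contrasting behavior of the two statistics can be illustrated with a short numerical sketch. It rests on assumptions that are not taken from Eqs. (32) and (33): the statistic is treated as Gaussian distributed, the two detectors are identical, the GCC statistic is approximated by pure clipping, and the critical value is taken where the main and tail Gaussian components have equal density. All function names are hypothetical.

```python
from math import erf, exp, inf, log, pi, sqrt
from statistics import NormalDist


def erfc_inv(y):
    """Inverse complementary error function via the standard normal quantile."""
    return -NormalDist().inv_cdf(y / 2.0) / sqrt(2.0)


def p_g(x, sigma):
    """Fraction of a Gaussian's variance contained in |n| < x."""
    if x == inf:
        return 1.0
    a = x / sigma
    return erf(a / sqrt(2.0)) - sqrt(2.0 / pi) * a * exp(-0.5 * a * a)


def x_critical(p, sigma_m, sigma_t):
    """ASSUMPTION: clipping point where both Gaussian components are equal."""
    if p <= 0.0 or sigma_t <= sigma_m:
        return inf
    log_ratio = log((1.0 - p) * sigma_t / (p * sigma_m))
    return sqrt(max(2.0 * log_ratio, 0.0)
                / (1.0 / sigma_m**2 - 1.0 / sigma_t**2))


def eps_detect(p, sigma_m, sigma_t, n, clipped, p_fa=0.1, p_fd=0.1):
    """Minimum detectable amplitude; clipped=False gives the SCC-like case."""
    x = x_critical(p, sigma_m, sigma_t) if clipped else inf
    # Noise second moment retained after clipping (per detector).
    var = ((1 - p) * sigma_m**2 * p_g(x, sigma_m)
           + p * sigma_t**2 * p_g(x, sigma_t))
    # Response of the mean statistic to eps**2 (per detector).
    gain = (1 - p) * p_g(x, sigma_m) + p * p_g(x, sigma_t)
    g = erfc_inv(2 * p_fa) + erfc_inv(2 * p_fd)
    return sqrt(sqrt(2.0) * g * var / (gain**2 * sqrt(n)))
```

Evaluating `eps_detect(0.1, 1.0, r, 10_000, clipped=True)` for `r = 4` and `r = 16` gives nearly the same amplitude, whereas the `clipped=False` value grows roughly in proportion to `r`, which is the qualitative trend described above.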



V. SUMMARY 



In this paper, we discussed a robust data analysis method to detect a stochastic background of gravitational waves in the presence of non-Gaussian noise. Specifically, we have discussed the generalized cross-correlation (GCC) statistic, which is a nearly optimal statistic, and quantified its detection efficiency in an analytic manner. To do this, we have focused on a simple but realistic non-Gaussian noise model, i.e., two-component Gaussian noise. We derived analytic formulas for the false-alarm and the false-dismissal probabilities as functions of the threshold value $\Lambda_*$ and obtained the $P_{\rm FA}$-$P_{\rm FD}$ curves. We also derived the minimum detectable amplitude of the stochastic signal, $\epsilon_{\rm detect}$. These analytic results were compared with Monte Carlo simulations of the cross-correlation analysis, and we found that the analytic formulas provide a good description. 

For a small tail fraction $P_i < 0.1$, Eqs. (32)-(34) show that the minimum detectable amplitude of the stochastic signal for the GCC statistic is related to that of the SCC statistic: 



where the quantity $\epsilon^{\rm SCC}_{\rm detect}$ becomes 



with $\gamma = {\rm erfc}^{-1}[2P_{\rm FA}]$. These two equations indicate that the minimum amplitude of the GCC statistic is mainly determined by the main part of the noise probability distribution and is insensitive to its tail part. Therefore, the quantity $\epsilon^{\rm GCC}_{\rm detect}$ is almost equivalent to the one derived from the SCC statistic when the contribution of the non-Gaussian tails is simply dropped: 

$$\epsilon^{\rm GCC}_{\rm detect} \propto \left\{ \frac{\gamma}{\sqrt{N}}\, \sigma_{m,1}\, \sigma_{m,2} \right\}^{1/2}.$$



Finally, we close this paper with comments and discussions. Throughout the paper, we have considered the two 
coincident and coaligned detectors with the white noise spectra. In practice, these restrictions must be relaxed. 
According to Refs. [11], [12], the GCC statistic has been extended to deal with the more realistic situation of non-coincident and non-coaligned detectors with colored noise. In this context, the analysis in the present paper roughly matches a narrow-band analysis in the Fourier domain, where the noise spectrum can be approximately described as white noise. The extension of the present analysis to the broad-band case would be straightforward and deserves consideration. Another important simplification in our analysis is the assumed stationarity of the instrumental noises and the neglect of a noise correlation between the two detectors. In practice, the noise correlation is known as a big 
obstacle in the LIGO at the Hanford site [l|[ and it would potentially be a serious problem in the future detector, 
LCGT 0. Thus, exploration of optimal data analysis strategy in the presence of not only the non-Gaussian noise 
but also the nonstationary noise and the noise correlation is a very important task for future detectors. 

It will be rather difficult to improve the sensitivity to the detectable amplitude by building a more sophisticated detector, due to the limitations of available technology and funds. Hence, efficient data analysis methods such as the GCC statistic should be further exploited and must be properly incorporated into future searches for stochastic gravitational waves. We will continue to address these issues by extending the present work to deal with more realistic situations. 
APPENDIX A: ANALYTICAL EXPRESSIONS FOR THE MEANS AND THE VARIANCES FOR THE GCC STATISTIC 



In this Appendix, we derive the analytical expressions (23)-(26) for the mean and the variance of the GCC statistic. First, we compute the mean and the variance in the absence of a signal, i.e., $\epsilon = 0$. Adopting the two-step approximation (21) with the critical value (22), we obtain 



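$$\langle \Lambda^{(0)}_{\rm GCC} \rangle = 0, \tag{A1}$$

$$\Delta\Lambda^{(0)}_{\rm GCC} = \sqrt{\frac{\langle n_1^2 \rangle_G\, \langle n_2^2 \rangle_G}{N}}, \tag{A2}$$

where 

$$\langle n_i^2 \rangle_G = \int_{-x_{{\rm cr},i}}^{x_{{\rm cr},i}} dn_i\, n_i^2\, p_{n,i}(n_i) + 2 \left( \frac{\sigma_{m,i}}{\sigma_{t,i}} \right)^4 \int_{x_{{\rm cr},i}}^{\infty} dn_i\, n_i^2\, p_{n,i}(n_i). \tag{A3}$$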

In the situation we are interested in, i.e., $P_i \ll 1$ and $(\sigma_{m,i}/\sigma_{t,i}) < 1$, the contribution of the second term on the right-hand side of Eq. (A3) is negligibly small. Thus, the variance of the noise is approximately described by the first term. In Ref. [11], this effect has been called clipping. Then, we have 



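$$\langle n_i^2 \rangle_G \simeq \int_{-x_{{\rm cr},i}}^{x_{{\rm cr},i}} dn_i\, n_i^2\, p_{n,i}(n_i) = (1 - P_i)\, \sigma_{m,i}^2\, P_G[x_{{\rm cr},i}, \sigma_{m,i}] + P_i\, \sigma_{t,i}^2\, P_G[x_{{\rm cr},i}, \sigma_{t,i}]. \tag{A4}$$

Here, the quantity $P_G[x, \sigma]$ is defined in Eq. (27): 

$$P_G[x, \sigma] = {\rm erf}\left( \frac{x}{\sqrt{2}\,\sigma} \right) - \sqrt{\frac{2}{\pi}}\, \frac{x}{\sigma}\, e^{-(x/\sigma)^2/2}.$$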

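Next, we consider the mean and the variance in the presence of gravitational-wave signals. The mean $\langle \Lambda_{\rm GCC} \rangle$ is expressed as 

$$\langle \Lambda_{\rm GCC} \rangle = \frac{1}{N} \sum_{k=1}^{N} \int ds_1^k\, ds_2^k\, f_1(s_1^k)\, f_2(s_2^k)\, p_s(s_1^k, s_2^k), \tag{A5}$$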
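where $p_s(s_1^k, s_2^k)$ is the joint probability distribution function for the two detector outputs, defined by 

$$p_s(s_1^k, s_2^k) = \int dh^k\, dn_1^k\, dn_2^k\, \delta(s_1^k - h^k - n_1^k)\, \delta(s_2^k - h^k - n_2^k)\, p_h(h^k)\, p_{n,1}(n_1^k)\, p_{n,2}(n_2^k). \tag{A6}$$

As long as the two-step approximation with clipping holds, the quantity (A5) up to $\mathcal{O}(\epsilon^2)$ becomes 

$$\langle \Lambda_{\rm GCC} \rangle \simeq \langle \Lambda_{\rm GCC} \rangle^{(\ell)} + \langle \Lambda_{\rm GCC} \rangle^{(h)}, \tag{A7}$$

where 

$$\langle \Lambda_{\rm GCC} \rangle^{(\ell)} = \epsilon^2 \left\{ 1 - (P_1 + P_2) \right\} P_G[x_{{\rm cr},1}, \sigma_{m,1}]\, P_G[x_{{\rm cr},2}, \sigma_{m,2}] \tag{A8}$$

and 

$$\begin{aligned}
\langle \Lambda_{\rm GCC} \rangle^{(h)} = {} & \epsilon^2 \left( P_1\, P_G[x_{{\rm cr},1}, \sigma_{t,1}]\, P_G[x_{{\rm cr},2}, \sigma_{m,2}] + P_2\, P_G[x_{{\rm cr},1}, \sigma_{m,1}]\, P_G[x_{{\rm cr},2}, \sigma_{t,2}] \right) \\
& + \epsilon^2 P_1 P_2 \left( P_G[x_{{\rm cr},1}, \sigma_{m,1}]\, P_G[x_{{\rm cr},2}, \sigma_{m,2}] - P_G[x_{{\rm cr},1}, \sigma_{t,1}]\, P_G[x_{{\rm cr},2}, \sigma_{m,2}] \right. \\
& \left. {} - P_G[x_{{\rm cr},1}, \sigma_{m,1}]\, P_G[x_{{\rm cr},2}, \sigma_{t,2}] + P_G[x_{{\rm cr},1}, \sigma_{t,1}]\, P_G[x_{{\rm cr},2}, \sigma_{t,2}] \right).
\end{aligned} \tag{A9}$$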

In the situation that $P_i \ll 1$ and $(\sigma_{m,i}/\sigma_{t,i}) < 1$, the critical value $x_{{\rm cr},i}$ defined in Eq. (22) satisfies the condition $\sigma_{m,i} \ll x_{{\rm cr},i} \ll \sigma_{t,i}$, so that $P_G[x_{{\rm cr},i}, \sigma_{m,i}]$ and $P_G[x_{{\rm cr},i}, \sigma_{t,i}]$ approximately become unity and zero, respectively. Therefore, one can regard the term $\langle \Lambda_{\rm GCC} \rangle^{(h)}$ as a negligible higher-order contribution. 
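The relative size of the two contributions can be checked numerically. The sketch below evaluates the leading term (A8) and the higher-order term (A9) for given values of $P_G$, and also verifies the algebraic observation that their sum factorizes into a product over the two detectors of $(1 - P_i) P_G[x_{{\rm cr},i}, \sigma_{m,i}] + P_i P_G[x_{{\rm cr},i}, \sigma_{t,i}]$. The function names are hypothetical, and the input values in the usage note are illustrative numbers for $P_i = 0.1$ and $\sigma_t/\sigma_m = 4$ computed under an assumed clipping point.

```python
def mean_terms(eps, p1, p2, pm1, pm2, pt1, pt2):
    """Leading (A8) and higher-order (A9) parts of the mean statistic.

    pm_i stands for P_G[x_cr_i, sigma_m_i] and pt_i for P_G[x_cr_i, sigma_t_i].
    """
    leading = eps**2 * (1.0 - (p1 + p2)) * pm1 * pm2
    higher = (eps**2 * (p1 * pt1 * pm2 + p2 * pm1 * pt2)
              + eps**2 * p1 * p2 * (pm1 * pm2 - pt1 * pm2
                                    - pm1 * pt2 + pt1 * pt2))
    return leading, higher


def mean_product_form(eps, p1, p2, pm1, pm2, pt1, pt2):
    """Factorized form of the sum of the leading and higher-order terms."""
    return (eps**2
            * ((1.0 - p1) * pm1 + p1 * pt1)
            * ((1.0 - p2) * pm2 + p2 * pt2))
```

For example, `mean_terms(0.1, 0.1, 0.1, 0.946, 0.946, 0.0764, 0.0764)` yields a higher-order part that is only a few percent of the leading part, in line with treating it as negligible for small tail fractions.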

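Finally, using the two-step approximation with clipping, the leading-order result for the quantity $\Delta\Lambda_{\rm GCC}$ becomes 

$$\Delta\Lambda_{\rm GCC} = \sqrt{\langle \Lambda_{\rm GCC}^2 \rangle - \langle \Lambda_{\rm GCC} \rangle^2} \tag{A10}$$

$$\simeq \frac{\left( \langle n_1^2 \rangle_G\, \langle n_2^2 \rangle_G \right)^{1/2}}{\sqrt{N}} \left[ 1 + \mathcal{O}\!\left( \frac{\epsilon^2}{\langle n_i^2 \rangle_G} \right) \right]. \tag{A11}$$

Thus, in the present situation, in which the detector noises dominate the gravitational-wave signal, we can reasonably treat the quantity $\Delta\Lambda_{\rm GCC}$ as 

$$\Delta\Lambda_{\rm GCC} \simeq \frac{\left( \langle n_1^2 \rangle_G\, \langle n_2^2 \rangle_G \right)^{1/2}}{\sqrt{N}} = \Delta\Lambda^{(0)}_{\rm GCC}. \tag{A12}$$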